Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Fix caching table with metadata table names #11123

Closed
wants to merge 1 commit into from

Conversation

manuzhang
Copy link
Collaborator

@manuzhang manuzhang commented Sep 13, 2024

When CachingCatalog finds a table having a MetadataTableType as its identifier name, it will load its namespace as a normal table. However, spark_catalog.default.history is a valid table identifier, although having MetadataTableType as its name. In this case, CachingCatalog throws NoSuchTableException currently. This PR catches the exception and load the table as a normal table.

cc @wypoon @ajantha-bhat @nastra

@manuzhang
Copy link
Collaborator Author

@nastra @ajantha-bhat @RussellSpitzer Please take a look when you find time.

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.TestTemplate;

public class TestCachingTableWithMetaTableName extends CatalogTestBase {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that there is an existing test suite, TestCachingCatalog. Wouldn't it be cleaner to add test coverage there instead of to the Spark test suite?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That test is to test CachingCatalog itself. This test is to test caching behavior in a real catalog.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that, but I think this patch did change the CachingCatalog internals itself, I don't see what is the additional value in putting this into a Spark test. TestCachingCatalog should definitely have coverage for behaviours like this. If you feel that an additional Spark test is also beneficial, I don't mind. I might miss some background, though.

Table originTable = tableCache.get(originTableIdentifier, catalog::loadTable);

// share TableOperations instance of origin table for all metadata tables, so that metadata
// table instances are
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I know there was only an indentation change here, but there is an odd linebreak in this line

@@ -145,22 +146,26 @@ public Table loadTable(TableIdentifier ident) {
}

if (MetadataTableUtils.hasMetadataTableName(canonicalized)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, as I see the RESTSessionCatalog does this the other way around: First tries to load the table as a regular table and then if that fails it tries to load it as a metadata table. https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java#L461
I find that approach cleaner, but let me know what you think.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you suggest we call catalog::loadTable first? If so, we might not be able to cache the original table, since the underlying catalog can load the metadata table directly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I had in mind was something different:
try {
return tableCache.get(canonicalized, catalog::loadTable);
} catch (NoSuchTableException) {
// Verify it's a metadata table
// Do whatever we want in case of a metadata table, even cache the original table.
}
What do you think?

Copy link
Collaborator Author

@manuzhang manuzhang Nov 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is noNoSuchTableException will ever be thrown on cache miss since catalog::loadTable can successfully load the metadata table. For example, in BaseMetastoreCatalog

public Table loadTable(TableIdentifier identifier) {
Table result;
if (isValidIdentifier(identifier)) {
TableOperations ops = newTableOps(identifier);
if (ops.current() == null) {
// the identifier may be valid for both tables and metadata tables
if (isValidMetadataIdentifier(identifier)) {
result = loadMetadataTable(identifier);
} else {
throw new NoSuchTableException("Table does not exist: %s", identifier);
}
} else {
result = new BaseTable(ops, fullTableName(name(), identifier), metricsReporter());
}
} else if (isValidMetadataIdentifier(identifier)) {
result = loadMetadataTable(identifier);
} else {
throw new NoSuchTableException("Invalid table identifier: %s", identifier);
}
LOG.info("Table loaded by catalog: {}", result);
return result;
}

In this case, we don't cache the original table and change the current behavior.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally I had some time to take a deeper look on this. You are right, with my above proposal we won't run the specific code for metadata tables like caching the underlying table and sharing the table operations.

I did find a way simpler version of this fix, however. I also wrote some tests in TestCachingCatalog. gaborkaszab@c1eca8b
What do you think? Would you mind if I opened an alternative fix PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please go ahead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @manuzhang !
I created this PR for the alternative fix: #11738 Added you as a co-author.
@RussellSpitzer I saw your thumbs-up on one of my comments. Would you mind taking a look at the PR?

Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jan 10, 2025
@nastra nastra added not-stale and removed stale labels Jan 10, 2025
@manuzhang
Copy link
Collaborator Author

Closed in favor of #11738

@manuzhang manuzhang closed this Jan 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants